Abstract

In this work, we obtain the following new results.

- Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a number $L_h$, we propose an $O(n)$-time algorithm for finding an index interval $[i, j]$ that maximizes $\frac{\sum_{k=i}^{j} h_k}{\sum_{k=i}^{j} s_k}$ subject to $\sum_{k=i}^{j} h_k \geq L_h$.

- Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i = 1$ for all $i$, and an integer $L_s$ with $1 \leq L_s \leq n$, we propose an $O(n \cdot T(L_s^{1/2}) / L_s^{1/2})$-time algorithm for finding an index interval $[i, j]$ that maximizes $\frac{\sum_{k=i}^{j} h_k}{\sqrt{\sum_{k=i}^{j} s_k}}$ subject to $\sum_{k=i}^{j} s_k \geq L_s$, where $T(n')$ is the time required to solve the all-pairs shortest paths problem on a graph of $n'$ nodes. By the latest result of Chan [8], $T(n') = O(n'^3 (\log\log n')^3 / (\log n')^2)$, so our algorithm runs in subquadratic time $O(n L_s (\log\log L_s)^3 / (\log L_s)^2)$.

1 Introduction 



Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, define the support, hit-support, confidence, eccentricity, and aberrance of an index interval $I = [i, j] = \{i, i+1, \ldots, j\}$ to be $\sum_{k=i}^{j} s_k$, $\sum_{k=i}^{j} h_k$, $\frac{\sum_{k=i}^{j} h_k}{\sum_{k=i}^{j} s_k}$, $\frac{\sum_{k=i}^{j} h_k}{\sqrt{\sum_{k=i}^{j} s_k}}$, and [aberrance formula lost in extraction], respectively. Denote by sup(i, j), hit(i, j), conf(i, j), ecc(i, j), and aberr(i, j) the support, hit-support, confidence, eccentricity, and aberrance of index interval $I = [i, j]$, respectively. The sequence $D$ is said to be plain if and only if $s_i = 1$ for all $i$. An index interval $I = [i, j]$ is said to be amble with respect to a support lower bound $L_s$ if sup(i, j) $\geq L_s$. An index interval $I = [i, j]$ is said to be endorsed with respect to a hit-support lower bound $L_h$ if hit(i, j) $\geq L_h$. An index interval $I = [i, j]$ is said to be confident with respect to a confidence lower bound $L_c$ if conf(i, j) $\geq L_c$. Consider the following problems arising in association rule mining flEl fT6], computational biology [aEllSlinillllllSlII^ITHlIiniEniEIlEH], and statistics [12].

*An earlier version of the second part of this work appeared in Proceedings of the 18th International Symposium on Algorithms and Computation, Japan, 2007.

- Hit-Constrained Max Confidence Interval (HCI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a hit-support lower bound $L_h$, find an endorsed interval $I = [i, j]$ maximizing the confidence conf(i, j). The results of Bernholt et al. [5] imply an $O(n \log n)$-time algorithm for this problem, and we give an $O(n)$-time algorithm in this paper.

- Plain Support-Constrained Max Eccentricity Interval (PSEI) Problem: Given a plain sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i = 1$ for all $i$, and a support lower bound $L_s$ with $1 \leq L_s \leq n$, find an amble interval $I = [i, j]$ maximizing the eccentricity ecc(i, j). Lipson et al. [22] proposed an approximation scheme for the case $L_s = 1$. Specifically, given an $\varepsilon \in (0, 1/5]$, their algorithm is guaranteed to output an index interval $[i, j]$ such that ecc(i, j) is at least $\mathrm{Opt}/\alpha(\varepsilon)$ in $O(n\varepsilon^{-2})$ time, where $\mathrm{Opt} = \max\{\mathrm{ecc}(i', j') : 1 \leq i' \leq j' \leq n\}$ and $\alpha(\varepsilon) = (1 - \sqrt{2\varepsilon(2 + \varepsilon)})^{-1}$. In this paper, we propose an $O(n \cdot T(L_s^{1/2}) / L_s^{1/2})$-time algorithm for this problem, where $T(n')$ is the time required to solve the all-pairs shortest paths problem on a graph of $n'$ nodes. By the latest result of Chan [8], $T(n') = O(n'^3 (\log\log n')^3 / (\log n')^2)$, so our algorithm runs in subquadratic time $O(n L_s (\log\log L_s)^3 / (\log L_s)^2)$. To the best of our knowledge, it is the first subquadratic result for this problem.

- Confidence-Constrained Max Hit Interval (CHI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a confidence lower bound $L_c$, find a confident interval $I = [i, j]$ maximizing the hit-support hit(i, j). The results of Bernholt et al. [5] imply an $O(n \log n)$-time algorithm for this problem, and recently, Cheng et al. [10] obtained an $O(n)$-time algorithm.

- Support-Constrained Max Confidence Interval (SCI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a support lower bound $L_s$, find an amble interval $I = [i, j]$ maximizing the confidence conf(i, j). This problem was studied in [3, HH [TSl [IZl dSl [ISl EOl EI] and can be solved in $O(n)$ time [3 [IB [13 [13 [20].

- Confidence-Constrained Max Support Interval (CSI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a confidence lower bound $L_c$, find a confident interval $I = [i, j]$ maximizing the support sup(i, j). This problem was studied in [2, 5, 9, 15, 28] and can be solved in $O(n)$ time [15].
- Support-Constrained Max Aberrance Interval (SAI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, a support lower bound $L_s$, and a support upper bound $U_s$, find an index interval $I = [i, j]$ maximizing the aberrance aberr(i, j) subject to $L_s \leq \mathrm{sup}(i, j) \leq U_s$. Bernholt et al. [5] proposed an $O(n)$-time algorithm for this problem.

- Support-Constrained Max Hit Interval (SHI) Problem: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a support lower bound $L_s$, find an amble interval $I = [i, j]$ maximizing the hit-support hit(i, j). This problem is solvable in $O(n)$ time by the algorithms in [5, 13, 21].

Results for these problems are summarized in Table 1. The rest of this paper is organized as follows. In Section 2, we give a linear-time algorithm for the HCI problem, which is an adaptation of the algorithm by Chung and Lu [11]. In Section 3, we give the first subquadratic-time algorithm for the PSEI problem. Finally, we close the paper by mentioning a few open problems.

2 A linear time algorithm for the HCI problem 

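For ease of exposition, we assume $L_h \geq 0$ in subsequent discussion. The restriction can be overcome as follows. If $L_h < 0$ and $h_i \geq 0$ for some $i$, then it is safe to reset $L_h$ to 0, because the single-index interval $[i, i]$ already has nonnegative confidence and any interval with negative hit-support has negative confidence. Otherwise, if $L_h < 0$ and $h_i < 0$ for all $i$, let $D' = ((h'_1, s'_1), (h'_2, s'_2), \ldots, (h'_n, s'_n)) = ((s_1, -h_1), (s_2, -h_2), \ldots, (s_n, -h_n))$ and $U_s = -L_h$. The problem is then reduced to finding an index interval $I = [i, j]$ that maximizes $\frac{\sum_{k=i}^{j} h'_k}{\sum_{k=i}^{j} s'_k}$ subject to $\sum_{k=i}^{j} s'_k \leq U_s$, which is solvable in $O(n)$ time in an online manner by Chung and Lu's algorithm [11].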

2.1 Preliminaries 

Let $H = (h_1, h_2, \ldots, h_n)$ and $P_H[0..n]$ be the prefix-sum array of $H$, where $P_H[0] = 0$ and $P_H[i] = P_H[i-1] + h_i$ for each $i = 1, 2, \ldots, n$. Let $S = (s_1, s_2, \ldots, s_n)$ and $P_S[0..n]$ be the prefix-sum array of $S$, where $P_S[0] = 0$ and $P_S[i] = P_S[i-1] + s_i$ for each $i = 1, 2, \ldots, n$. Both $P_H$ and $P_S$ can be computed in $O(n)$ time in an online manner.¹ Note that for $1 \leq i \leq j \leq n$, hit(i, j) $= P_H[j] - P_H[i-1]$, sup(i, j) $= P_S[j] - P_S[i-1]$, and conf(i, j) $= \frac{P_H[j] - P_H[i-1]}{P_S[j] - P_S[i-1]}$. Therefore, by keeping $P_H$ and $P_S$, each computation of the hit-support, support, or confidence of an index interval can be done in constant time. We next introduce the notion of partners. For technical reasons, we define hit(0, p) $= L_h$ and conf(0, p) $= -\infty$ for all indices $p$.

¹An algorithm is said to run in an online manner if and only if it can process its input piece-by-piece and maintain a solution for the pieces processed so far.
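
The constant-time queries described above can be sketched as follows. This is a minimal illustration, not code from the paper: the helper names `prefix_sums` and `make_interval_queries` are hypothetical, and intervals are 1-based and inclusive to match the text.

```python
from itertools import accumulate


def prefix_sums(values):
    """Return P[0..n] with P[0] = 0 and P[i] = P[i-1] + values[i-1]."""
    return [0] + list(accumulate(values))


def make_interval_queries(h, s):
    """Build O(1) hit/sup/conf queries for 1-based inclusive intervals [i, j]."""
    PH, PS = prefix_sums(h), prefix_sums(s)

    def hit(i, j):
        return PH[j] - PH[i - 1]

    def sup(i, j):
        return PS[j] - PS[i - 1]

    def conf(i, j):
        # Well defined because every s_k is assumed to be positive.
        return hit(i, j) / sup(i, j)

    return hit, sup, conf


# Small made-up instance, used only for illustration.
hit, sup, conf = make_interval_queries([2, -1, 3, 0], [1, 2, 1, 4])
```

After the linear-time preprocessing, each call touches two array entries per prefix-sum array, which is what makes the later subroutines affordable.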
Table 1: Results on problems of finding constrained optimal intervals.

Paper                      Problem   Time
-------------------------  --------  ---------------------------------------------
Bernholt et al. [5] †      HCI       $O(n \log n)$
This paper                 HCI       $O(n)$
Lipson et al. [22]         PSEI      Approximation scheme for $L_s = 1$
This paper                 PSEI      $O(n \cdot T(L_s^{1/2}) / L_s^{1/2})$
Bernholt et al. [5] †      CHI       $O(n \log n)$
Cheng et al. [10]          CHI       $O(n)$
Huang                      SCI
Fukuda et al. [15]         SCI       $O(n)$
Lin et al.                 SCI       $O(n \log L_s)$
Kim et al. [19]            SCI       $O(n)$
Chung & Lu [11] †          SCI       $O(n)$
Goldwasser et al. [17] †   SCI       $O(n)$
Bernholt et al. [5] †      SCI       $O(n)$
Lee et al. †               SCI       $O(n)$
Fukuda et al. [15]         CSI       $O(n)$
Allison [2]                CSI
Wang & Xu [28]             CSI       $O(n)$
Chen & Chao [9] †          CSI       $O(n)$*
Bernholt et al. [5] †      CSI       $O(n)$ if $h_i \in \{0, 1\}$ and $s_i = 1$ for all $i$
Bernholt et al. [5] †      SAI       $O(n)$
Lin [21]                   SHI       $O(n)$
Fan [13]                   SHI       $O(n)$
Bernholt et al. [5] †      SHI       $O(n)$

† As a matter of fact, they solved more general problems than we list.
* The time bounds hold provided that the input sequence $D$ is plain.




Definition 1: Given an index $q$, a nonnegative integer $p$ is said to be a partner of $q$ if and only if hit(p, q) $\geq L_h$.

Definition 2: Given an index $q$, an integer $p$ is said to be the best partner $\pi_q$ of $q$ if and only if $p$ is the largest partner of $q$ such that conf(p, q) $= \max\{\mathrm{conf}(i, q) : i$ is a partner of $q\}$.

Definition 3: Denote by $r_q$ the rightmost partner (i.e., the largest partner) of index $q$. Define the ideal rightmost partner of index $q$ as $\hat{r}_q = \max_{1 \leq p \leq q} r_p$.

Definition 4: An index $q$ is said to be a good index if and only if $\hat{r}_q = r_q$; otherwise, $q$ is said to be a bad index.
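
To make Definitions 1, 3, and 4 concrete, here is a brute-force sketch that computes the rightmost partner of every index and then lists the good indices. It is only a quadratic-time illustration of the definitions; the function names are hypothetical, and index 0 is treated as a partner of every index, following the convention hit(0, p) = L_h.

```python
def rightmost_partners(h, L_h):
    """Brute force: r[q] = largest p in [0, q] with hit(p, q) >= L_h."""
    n = len(h)
    r = [0] * (n + 1)              # r[0] is unused; 0 is always a partner
    for q in range(1, n + 1):
        total = 0
        for p in range(q, 0, -1):  # scan left endpoints from q down to 1
            total += h[p - 1]      # total == hit(p, q)
            if total >= L_h:
                r[q] = p           # first success is the largest partner
                break
    return r


def good_indices(r):
    """q is good iff r[q] equals the running maximum of r[1..q]."""
    good, running_max = [], 0
    for q in range(1, len(r)):
        running_max = max(running_max, r[q])
        if r[q] == running_max:
            good.append(q)
    return good
```

For example, with `h = [3, -5, 1]` and `L_h = 2`, only index 1 has a positive rightmost partner, so indices 2 and 3 are bad.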

Let $q^*$ be an index that maximizes conf($\pi_{q^*}$, $q^*$). If $\pi_{q^*} = 0$, then there is no endorsed interval; otherwise, index interval $[\pi_{q^*}, q^*]$ is a maximum-confidence endorsed interval. Therefore, to solve the HCI problem, it suffices to find index $q^*$. The next lemma enables us to eliminate bad indices from consideration.

Lemma 1: If $q$ is a bad index, then index interval $[\pi_q, q]$ is not a maximum-confidence endorsed interval.

Proof: If $q$ is a bad index, then there exists $p < q$ such that $r_q < \hat{r}_q = r_p$. Since hit($\hat{r}_q$, p) $\geq L_h$ and hit($\hat{r}_q$, q) $< L_h$, we have $P_H[p] > P_H[q]$. It follows that if index interval $[\pi_q, q]$ is an endorsed interval, then index interval $[\pi_q, p]$ is an endorsed interval and conf($\pi_q$, p) $>$ conf($\pi_q$, q). Thus, index interval $[\pi_q, q]$ cannot be a maximum-confidence endorsed interval. □



2.2 Subroutine 

We next give a subroutine to compute $r_q$ for all good indices $q$ in $O(n)$ time. The pseudocode is given below, where the goal is to fill in an array $R[1..n]$, initialized with $-1$'s, such that $R[i] = r_i$ for all good indices $i$ and $R[i] = -1$ for all bad indices $i$ at the end.

Subroutine RMP

 1: create an array $R[1..n]$ initialized with $-1$'s;
 2: $\hat{R} \leftarrow 0$;
 3: create an empty list $C$;
 4: for $i \leftarrow 1$ to $n$ do
 5:     while $C$ is not empty and hit(C.lastElement() + 1, i) $\leq 0$ do
 6:         delete from $C$ its last element;
 7:     end while
 8:     insert $i$ at the end of $C$;
 9:     if hit($\hat{R}$, i) $\geq L_h$ or hit(C.firstElement(), i) $\geq L_h$ then
10:         while $C$ is not empty and hit(C.firstElement(), i) $\geq L_h$ do
11:             $\hat{R} \leftarrow$ C.firstElement();
12:             delete from $C$ its first element;
13:         end while
14:         $R[i] \leftarrow \hat{R}$;
15:     end if
16: end for
17: output $R[1..n]$;

Subroutine RMP consists of $n$ iterations; in the $i$-th iteration, $R[i]$ is reset to $r_i$ if $i$ is a good index. To accomplish this task efficiently, we maintain a list $C$ and a variable $\hat{R}$ such that at the end of the $i$-th iteration the following conditions hold.

1. For each two adjacent elements $p < q$ in $C$, hit(p + 1, q) $> 0$.

2. For any index $q > i$, if $r_q \in [\hat{R}, i]$, then $r_q \in C \cup \{\hat{R}\}$.

3. $\hat{R} = \hat{r}_i$.

It is clear that $R[1] = r_1 = \hat{r}_1$ and the three conditions hold at the end of the first iteration of the for-loop. Suppose that the three conditions hold at the end of the $(i-1)$-th iteration of the for-loop. We shall prove that $R[i]$ will be reset to $r_i$ in the $i$-th iteration of the for-loop if and only if $i$ is a good index, and that the three conditions hold at the end of the $i$-th iteration of the for-loop. Consider the moment immediately before the execution of line 9 in the $i$-th iteration of the for-loop. It is clear that condition 1 still holds at this moment. We next prove that condition 2 also holds at this moment. Suppose for contradiction that some $r_q$, where $q > i$, is removed from $C$ during the execution of the while-loop of lines 5-7. A necessary condition for $r_q$ to be deleted is hit($r_q$ + 1, i) $\leq 0$. It follows that hit(i, q) $\geq$ hit($r_q$, q) $\geq L_h$, so $r_q$ is not the rightmost partner of $q$, a contradiction. Therefore, conditions 1 and 2 hold and $\hat{R} = \hat{r}_{i-1}$ just before the execution of line 9. We next examine the execution of lines 9-15. Consider the following three cases.

Case 1: $r_i \notin C \cup \{\hat{R}\}$ just before the execution of line 9. By condition 2, we have $r_i < \hat{r}_{i-1}$, so index $i$ is not good and $\hat{r}_{i-1} = \hat{r}_i$. Thus, condition 3 holds just before the execution of line 9. In this case the test in line 9 fails, so lines 10-14 will not be executed. Therefore, $R[i]$ is not reset, and conditions 1-3 hold at the end of the $i$-th iteration of the for-loop.

Case 2: $r_i = \hat{R}$ just before the execution of line 9. It follows that index $i$ is good and $\hat{r}_{i-1} = r_i = \hat{r}_i$. Thus, condition 3 holds just before the execution of line 9. The body of the while-loop of lines 10-13 will not be executed in this case. Therefore, $R[i]$ is reset to $\hat{R} = r_i$ in line 14, and conditions 1-3 hold at the end of the $i$-th iteration of the for-loop.

Case 3: $r_i \in C$ just before the execution of line 9. It follows that $r_i > \hat{r}_{i-1}$, so $\hat{r}_i = r_i$ and index $i$ is good. By condition 1, for all $c \in C$ before $r_i$, hit(c, i) $>$ hit($r_i$, i) $\geq L_h$. Moreover, since $r_i$ is the rightmost partner of $i$ and indices in $C$ are in increasing order, for all $c' \in C$ after $r_i$, hit(c', i) $< L_h$. Therefore, $\hat{R} = r_i$ holds after the execution of the while-loop of lines 10-13. It follows that condition 3 holds at the end of the $i$-th iteration of the for-loop since $r_i = \hat{r}_i$. Deleting a prefix from $C$ clearly does not violate condition 1, so it remains to prove that condition 2 still holds. Suppose for contradiction that condition 2 does not hold at the end of the $i$-th iteration of the for-loop. It follows that some $r_q$ with $r_i < r_q \leq i$ is removed from $C$, which leads to a contradiction because $r_i$ is the largest index removed from $C$ in the while-loop of lines 10-13.

We next analyze the running time. In each iteration of the while-loop of lines 5-7 and the while-loop of lines 10-13, one index is removed from $C$. Since each index is inserted into $C$ at most once, the time spent on these two while-loops is bounded by $O(n)$. To summarize, we have the following lemma.

Lemma 2: Subroutine RMP computes in $O(n)$ time an array $R[1..n]$ such that $R[i]$ is the rightmost partner of $i$ if $i$ is a good index and $R[i] = -1$ if $i$ is a bad index.
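
A runnable sketch of the same sliding-candidate idea is given below, using a double-ended queue for the list C. It is an illustration under stated assumptions rather than a transcription of Subroutine RMP: the function name `rmp` is hypothetical, and the dominance test at the back of the queue is written as hit(c, i-1) <= 0 (equivalently P_H[i-1] <= P_H[c-1]), an indexing chosen here so that it matches hit(c, q) = P_H[q] - P_H[c-1].

```python
from collections import deque
from itertools import accumulate


def rmp(h, L_h):
    """Return R[0..n] with R[i] = rightmost partner of i if i is good, else -1.

    R[0] is an unused placeholder so that indices are 1-based as in the text.
    """
    n = len(h)
    PH = [0] + list(accumulate(h))

    def hit(p, q):
        # Convention from the text: index 0 is a partner of every index.
        return L_h if p == 0 else PH[q] - PH[p - 1]

    R = [-1] * (n + 1)
    R_hat = 0                      # ideal rightmost partner seen so far
    C = deque()                    # candidate left endpoints, increasing
    for i in range(1, n + 1):
        # Drop candidates c that i dominates: whenever c is a partner of
        # some later q, the larger index i is a partner of q as well.
        while C and hit(C[-1], i - 1) <= 0:
            C.pop()
        C.append(i)
        if hit(R_hat, i) >= L_h or hit(C[0], i) >= L_h:
            # Partners of i form a prefix of C; the last one popped is r_i.
            while C and hit(C[0], i) >= L_h:
                R_hat = C.popleft()
            R[i] = R_hat
    return R
```

Each index enters and leaves the queue at most once, so the sketch runs in linear time, in line with Lemma 2.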

Denote by $\phi(x, y)$ the largest $z$ in $[x, y]$ that minimizes conf(x, z). We next describe a subroutine which finds the largest $p \in [l, u]$ that maximizes conf(p, q), given $0 \leq l \leq u \leq q$. The pseudocode is given below, where we initialize a variable $p$ with value $l$ and then repeatedly reset $p$ to $\phi(p, u - 1) + 1$ until $p = u$ or the lowest-confidence interval starting from $p$ and ending before $u$ has the same confidence as interval $[p, q]$.

Subroutine BEST($l$, $u$, $q$)

1: $p \leftarrow l$;
2: while $p < u$ and conf(p, $\phi(p, u - 1)$) $\leq$ conf(p, q) do
3:     $p \leftarrow \phi(p, u - 1) + 1$;
4: end while
5: output $p$;

Lemma 3: [11] The call to BEST($l$, $u$, $q$) will return the largest $p \in [l, u]$ that maximizes conf(p, q) if $0 \leq l \leq u \leq q$.

Lemma 4: Let $p$ be the return value of the call to BEST($l$, $r_q$, $q$). Then $p = \pi_q$ if $\pi_q \in [l, r_q]$ and $0 \leq l \leq r_q$.

Proof: Suppose that $p$ is not a partner of $q$, i.e., hit(p, q) $< L_h \leq$ hit($r_q$, q). It follows that conf(p, q) $<$ conf($r_q$, q), which contradicts Lemma 3. Thus, $p$ must be a partner of $q$. Suppose for contradiction that $p \neq \pi_q$. Since $p$ is a partner of $q$, by the definition of best partners, we have either conf(p, q) $<$ conf($\pi_q$, q), or conf(p, q) $=$ conf($\pi_q$, q) and $p < \pi_q$, which contradicts Lemma 3. □

Chung and Lu [11] also gave efficient implementations of Subroutine BEST in their paper, which directly imply the next lemma.

Lemma 5: [11] A sequence of consecutive calls to Subroutine BEST, say BEST($l_1$, $u_1$, $q_1$), BEST($l_2$, $u_2$, $q_2$), ..., BEST($l_k$, $u_k$, $q_k$), can be completed in total $O(u_k + k)$ time provided that $l_1 = 0$, $l_i =$ BEST($l_{i-1}$, $u_{i-1}$, $q_{i-1}$) for each $i = 2, 3, \ldots, k$, $u_1 \leq u_2 \leq \cdots \leq u_k$, and $u_i \leq q_i$ for each $i = 1, 2, \ldots, k$.
2.3 Algorithm

Our algorithm for the HCI problem is as follows. First, we initialize a variable $l$ with value 0 and call Subroutine RMP to compute an array $R[1..n]$ such that $R[i] = r_i$ if $i$ is a good index and $R[i] = -1$ if $i$ is a bad index. Then, for each good index $q$, taken in increasing order, we call BEST($l$, $R[q]$, $q$) to compute the largest $l_q \in [l, R[q]]$ that maximizes conf($l_q$, q) and reset variable $l$ to $l_q$. Finally, the index interval $[l_q, q]$ that maximizes conf($l_q$, q) is returned. The pseudocode is given below.

Algorithm ComputeHCI

Input: A sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a hit-support lower bound $L_h$.
Output: An index interval $I = [i, j]$ maximizing conf(i, j) subject to hit(i, j) $\geq L_h$.

 1: $l \leftarrow 0$;
 2: $R \leftarrow$ call Subroutine RMP;
 3: $c_{max} \leftarrow -\infty$;
 4: $(\alpha, \beta) \leftarrow (0, 0)$;
 5: for $q \leftarrow 1$ to $n$ do
 6:     if $R[q] \neq -1$ then
 7:         $l \leftarrow$ BEST($l$, $R[q]$, $q$);
 8:         if conf(l, q) $> c_{max}$ then
 9:             $c_{max} \leftarrow$ conf(l, q);
10:             $(\alpha, \beta) \leftarrow (l, q)$;
11:         end if
12:     end if
13: end for
14: output $[\alpha, \beta]$;

Theorem 1: Algorithm ComputeHCI solves the HCI problem in $O(n)$ time.

Proof: We first prove the correctness. Let $Q = \{q_1, q_2, \ldots, q_k\}$ be the set of good indices, where $q_1 < q_2 < \cdots < q_k$ and $k = |Q|$. Let $l_{q_0} = 0$ and $l_{q_i}$ be the return value of the call to Subroutine BEST in the $q_i$-th iteration of the for-loop, $i = 1, 2, \ldots, k$. Note that $l_{q_i}$ is the largest integer in $[l_{q_{i-1}}, r_{q_i}]$ that maximizes conf($l_{q_i}$, $q_i$) for all $i$ with $1 \leq i \leq k$. Let $q_{i^*}$ be the good index that maximizes conf($\pi_{q_{i^*}}$, $q_{i^*}$). By Lemma 1, it suffices to prove that $\pi_{q_{i^*}} = l_{q_{i^*}}$. To prove $\pi_{q_{i^*}} = l_{q_{i^*}}$, by Lemma 4 it suffices to prove that $\pi_{q_{i^*}} \in [l_{q_{i^*-1}}, r_{q_{i^*}}]$, i.e., $\pi_{q_{i^*}} \geq l_{q_{i^*-1}}$. Suppose for contradiction that $\pi_{q_{i^*}} < l_{q_{i^*-1}}$. Let $q_t \leq q_{i^*-1}$ be the first good index such that $\pi_{q_{i^*}} < l_{q_t}$. Then we have $\pi_{q_{i^*}} \in [l_{q_{t-1}}, r_{q_t}]$ and, by Lemma 3, $l_{q_t}$ is the largest index in $[l_{q_{t-1}}, r_{q_t}]$ that maximizes conf($l_{q_t}$, $q_t$). It follows that

    conf($\pi_{q_{i^*}}$, $l_{q_t} - 1$) $\leq$ conf($l_{q_t}$, $q_t$).    (1)

Since conf($l_{q_t}$, $q_t$) $\geq$ conf($r_{q_t}$, $q_t$) $\geq \frac{L_h}{\mathrm{sup}(r_{q_t}, q_t)} \geq 0$ and sup($l_{q_t}$, $q_t$) $\geq$ sup($r_{q_t}$, $q_t$), we have hit($l_{q_t}$, $q_t$) $\geq$ hit($r_{q_t}$, $q_t$) $\geq L_h$. Therefore, $l_{q_t}$ is a partner of $q_t$ and we have

    conf($l_{q_t}$, $q_t$) $\leq$ conf($\pi_{q_t}$, $q_t$).    (2)

Following from inequality (1), we have conf($l_{q_t}$, $q_t$) $\leq$ conf($q_t + 1$, $q_{i^*}$), for otherwise conf($\pi_{q_t}$, $q_t$) $\leq$ conf($\pi_{q_{i^*}}$, $q_{i^*}$) $<$ conf($l_{q_t}$, $q_t$), which contradicts inequality (2). Combining inequality (1) with conf($l_{q_t}$, $q_t$) $\leq$ conf($q_t + 1$, $q_{i^*}$), we have conf($\pi_{q_{i^*}}$, $l_{q_t} - 1$) $\leq$ conf($l_{q_t}$, $q_t$) $\leq$ conf($q_t + 1$, $q_{i^*}$). It follows that

    conf($\pi_{q_{i^*}}$, $q_{i^*}$) $\leq$ conf($l_{q_t}$, $q_{i^*}$).    (3)

By inequality (3), $0 \leq \frac{L_h}{\mathrm{sup}(r_{q_{i^*}}, q_{i^*})} \leq$ conf($r_{q_{i^*}}$, $q_{i^*}$), and conf($r_{q_{i^*}}$, $q_{i^*}$) $\leq$ conf($\pi_{q_{i^*}}$, $q_{i^*}$), we have

    $0 \leq$ conf($r_{q_{i^*}}$, $q_{i^*}$) $\leq$ conf($l_{q_t}$, $q_{i^*}$).    (4)

Since $l_{q_t} \leq r_{q_t} \leq r_{q_{i^*}} \leq q_{i^*}$, we have sup($r_{q_{i^*}}$, $q_{i^*}$) $\leq$ sup($l_{q_t}$, $q_{i^*}$). By inequality (4) and sup($r_{q_{i^*}}$, $q_{i^*}$) $\leq$ sup($l_{q_t}$, $q_{i^*}$), we have

    $L_h \leq$ hit($r_{q_{i^*}}$, $q_{i^*}$) $\leq$ hit($l_{q_t}$, $q_{i^*}$).    (5)

By inequalities (3) and (5), $l_{q_t}$ is a partner of $q_{i^*}$ and conf($\pi_{q_{i^*}}$, $q_{i^*}$) $\leq$ conf($l_{q_t}$, $q_{i^*}$), which contradicts the definition of best partners because $l_{q_t} > \pi_{q_{i^*}}$.

We now analyze the time complexity. By Lemma 2, the call to RMP takes $O(n)$ time. By the definition of good indices, we have $R[q_1] \leq R[q_2] \leq \cdots \leq R[q_k]$. Because we also have $l_{q_i} =$ BEST($l_{q_{i-1}}$, $R[q_i]$, $q_i$) for each $i = 1, 2, \ldots, k$, by Lemma 5, the calls to Subroutine BEST, i.e., BEST($l_{q_0}$, $R[q_1]$, $q_1$), BEST($l_{q_1}$, $R[q_2]$, $q_2$), ..., and BEST($l_{q_{k-1}}$, $R[q_k]$, $q_k$), take $O(R[q_k] + k) = O(n)$ time in total. □
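
When implementing the linear-time algorithm, a quadratic-time reference solution is useful as a correctness oracle. The sketch below is such a reference, not the algorithm of this section: the name `hci_bruteforce` is hypothetical, and it assumes integer inputs so that confidences can be compared exactly with `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import accumulate


def hci_bruteforce(h, s, L_h):
    """Return (i, j), 1-based inclusive, maximizing conf(i, j) subject to
    hit(i, j) >= L_h, or None if no endorsed interval exists. O(n^2) time."""
    n = len(h)
    PH = [0] + list(accumulate(h))
    PS = [0] + list(accumulate(s))
    best, best_conf = None, None
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            hit = PH[j] - PH[i - 1]
            if hit < L_h:
                continue                      # interval is not endorsed
            conf = Fraction(hit, PS[j] - PS[i - 1])
            if best_conf is None or conf > best_conf:
                best, best_conf = (i, j), conf
    return best
```

Comparing the output confidence of a fast implementation against this oracle on small random inputs is a cheap way to catch off-by-one errors in the partner bookkeeping.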

Finally, we modify Algorithm ComputeHCI such that it runs in an online manner. The modified version is given below, where we maintain an integer pair $(\alpha, \beta)$ such that after processing the $q$-th number pair in $D$ at the $q$-th iteration of the for-loop, the integer interval $[\alpha, \beta]$ will be a maximum-confidence endorsed interval for the subsequence $((h_1, s_1), (h_2, s_2), \ldots, (h_q, s_q))$, if any. The correctness is easy to verify by noting that $r = R[q]$ holds at the end of the $q$-th iteration of the for-loop for each $q = 1, 2, \ldots, n$.

Theorem 2: Algorithm OnlineComputeHCI solves the HCI problem in $O(n)$ time in an online manner.
Algorithm OnlineComputeHCI

Input: A sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i > 0$ for all $i$, and a hit-support lower bound $L_h$.

Goal: Maintaining an integer pair $(\alpha, \beta)$ such that after processing the $q$-th number pair in $D$ at the $q$-th iteration of the for-loop, the integer interval $[\alpha, \beta]$ is a maximum-confidence endorsed interval for the subsequence $((h_1, s_1), (h_2, s_2), \ldots, (h_q, s_q))$, if any.

 1: $l \leftarrow 0$;
 2: $r \leftarrow -1$;
 3: $\hat{r} \leftarrow 0$;
 4: $c_{max} \leftarrow -\infty$;
 5: $(\alpha, \beta) \leftarrow (0, 0)$;
 6: create an empty list $C$;
 7: for $q \leftarrow 1$ to $n$ do
 8:     $r \leftarrow -1$;
 9:     while $C$ is not empty and hit(C.lastElement() + 1, q) $\leq 0$ do
10:         delete from $C$ its last element;
11:     end while
12:     insert $q$ at the end of $C$;
13:     if hit($\hat{r}$, q) $\geq L_h$ or hit(C.firstElement(), q) $\geq L_h$ then
14:         while $C$ is not empty and hit(C.firstElement(), q) $\geq L_h$ do
15:             $\hat{r} \leftarrow$ C.firstElement();
16:             delete from $C$ its first element;
17:         end while
18:         $r \leftarrow \hat{r}$;
19:     end if
20:     if $r \neq -1$ then
21:         $l \leftarrow$ BEST($l$, $r$, $q$);
22:         if conf(l, q) $> c_{max}$ then
23:             $c_{max} \leftarrow$ conf(l, q);
24:             $(\alpha, \beta) \leftarrow (l, q)$;
25:         end if
26:     end if
27: end for
3 A subquadratic time algorithm for the PSEI problem

Define the length of an index interval $[i, j]$ to be length(i, j) $= j - i + 1$. In the PSEI problem, the input sequence $D$ is plain, so we have length(i, j) $=$ sup(i, j) $= j - i + 1$ and ecc(i, j) $= \frac{\mathrm{hit}(i, j)}{\sqrt{\mathrm{length}(i, j)}}$. Thus, we can reformulate the PSEI problem as follows: Given a sequence $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$ of number pairs, where $s_i = 1$ for all $i$, and a length lower bound $L = L_s$, find an index interval $I = [i, j]$ maximizing the eccentricity ecc(i, j) $= \frac{\mathrm{hit}(i, j)}{\sqrt{\mathrm{length}(i, j)}}$ subject to length(i, j) $\geq L$.

3.1 Preliminaries

Let $H = (h_1, h_2, \ldots, h_n)$ and $P_H[0..n]$ be the prefix-sum array of $H$, where $P_H[0] = 0$ and $P_H[i] = P_H[i-1] + h_i$ for each $i = 1, 2, \ldots, n$. Note that hit(i, j) $= P_H[j] - P_H[i-1]$, length(i, j) $= j - i + 1$, and ecc(i, j) $= \frac{P_H[j] - P_H[i-1]}{\sqrt{j - i + 1}}$. Thus, after constructing $P_H$ in $O(n)$ time, each computation of the hit-support, length, or eccentricity of an index interval can be done in constant time. In the following, we review some definitions and theorems. For more details,
readers can refer to [H [5l [6|, HH [27] . 

Definition 5: A function $f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}$ is said to be quasiconvex if and only if for all points $u, v \in \mathbb{R}^+ \times \mathbb{R}$ and all $\lambda \in [0, 1]$, we have $f(\lambda \cdot u + (1 - \lambda) \cdot v) \leq \max\{f(u), f(v)\}$.

Lemma 6: [6] Define $f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}$ by letting $f(\ell, h) = \frac{h}{\sqrt{\ell}}$ if $h > 0$, and $f(\ell, h) = 0$ otherwise. Then $f$ is quasiconvex.

Theorem 3: [5] Given a sequence of $n$ number pairs $D = ((h_1, s_1), (h_2, s_2), \ldots, (h_n, s_n))$, a length lower bound $L$, and a quasiconvex score function $f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}$, there exists an algorithm, denoted by MSI($D$, $L$, $f$), which can find an index interval $[i, j]$ that maximizes $f$(length(i, j), hit(i, j)) subject to length(i, j) $\geq L$ in $O(n)$ time.

By the fact that $f(\ell, h) = h$ is quasiconvex and Theorem 3, we have the following corollary, which was also proved in [5, 13, 21].

Corollary 1: There exists an $O(n)$-time algorithm for finding an index interval $[i, j]$ maximizing hit(i, j) subject to length(i, j) $\geq L$.
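We next prove that if all index intervals of length at least $L$ have negative hit-supports, then the optimal solution must have length less than $2L$.

Lemma 7: If hit(p, q) $< 0$ holds for each index interval $[p, q]$ of length at least $L$, then length($p^*$, $q^*$) $< 2L$, where $(p^*, q^*) = \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{ecc}(p, q)$.

Proof: Let $(p^*, q^*) = \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{ecc}(p, q)$. Suppose for contradiction that length($p^*$, $q^*$) $\geq 2L$. Let $c_1 = \lfloor (p^* + q^*)/2 \rfloor$ and $c_2 = c_1 + 1$. Then we have length($p^*$, $c_1$) $\geq L$ and length($c_2$, $q^*$) $\geq L$. Since

$$\frac{\mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} = \frac{\mathrm{hit}(p^*, c_1) + \mathrm{hit}(c_2, q^*)}{\mathrm{length}(p^*, c_1) + \mathrm{length}(c_2, q^*)},$$

we have

$$\frac{\mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} \leq \frac{\mathrm{hit}(p^*, c_1)}{\mathrm{length}(p^*, c_1)} \quad \text{or} \quad \frac{\mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} \leq \frac{\mathrm{hit}(c_2, q^*)}{\mathrm{length}(c_2, q^*)}.$$

Without loss of generality, we assume $\frac{\mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} \leq \frac{\mathrm{hit}(p^*, c_1)}{\mathrm{length}(p^*, c_1)}$. Since $\frac{\mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} \leq \frac{\mathrm{hit}(p^*, c_1)}{\mathrm{length}(p^*, c_1)} < 0$ and $\sqrt{\mathrm{length}(p^*, q^*)} > \sqrt{\mathrm{length}(p^*, c_1)}$, we have

$$\frac{\sqrt{\mathrm{length}(p^*, q^*)} \cdot \mathrm{hit}(p^*, q^*)}{\mathrm{length}(p^*, q^*)} < \frac{\sqrt{\mathrm{length}(p^*, c_1)} \cdot \mathrm{hit}(p^*, c_1)}{\mathrm{length}(p^*, c_1)}$$
$$\iff \frac{\mathrm{hit}(p^*, q^*)}{\sqrt{\mathrm{length}(p^*, q^*)}} < \frac{\mathrm{hit}(p^*, c_1)}{\sqrt{\mathrm{length}(p^*, c_1)}}$$
$$\iff \mathrm{ecc}(p^*, q^*) < \mathrm{ecc}(p^*, c_1).$$

This contradicts $(p^*, q^*) = \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{ecc}(p, q)$. □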
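
Lemma 7 can be checked on a small instance with a brute-force search. The sketch below is only illustrative: the function name `best_ecc_interval` and the sample data are made up, and the search is quadratic.

```python
import math
from itertools import accumulate


def best_ecc_interval(h, L):
    """Brute force: return (ecc, p, q), 1-based inclusive, maximizing
    ecc(p, q) = hit(p, q) / sqrt(length(p, q)) over length(p, q) >= L."""
    n = len(h)
    P = [0] + list(accumulate(h))
    best = None
    for p in range(1, n + 1):
        for q in range(p + L - 1, n + 1):
            ecc = (P[q] - P[p - 1]) / math.sqrt(q - p + 1)
            if best is None or ecc > best[0]:
                best = (ecc, p, q)
    return best


# Every entry is negative, so every interval has negative hit-support.
h = [-1, -2, -1, -3, -1, -1]
L = 2
ecc, p, q = best_ecc_interval(h, L)
```

On this input the optimum is the interval [5, 6], whose length 2 is below 2L = 4, as the lemma predicts.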



3.2 Subroutine 

In the following we give a new algorithm for the Min-Plus Convolution Problem, which will serve as a subroutine in our algorithm for the PSEI problem. The min-plus convolution of two vectors $x = (x_0, x_1, \ldots, x_{n-1})$ and $y = (y_0, y_1, \ldots, y_{n-1})$ is a vector $z = (z_0, z_1, \ldots, z_{n-1})$ such that $z_k = \min_{i=0}^{k} \{x_i + y_{k-i}\}$ for $k = 0, 1, \ldots, n - 1$. Given two vectors $x = (x_0, x_1, \ldots, x_{n-1})$ and $y = (y_0, y_1, \ldots, y_{n-1})$, the Min-Plus Convolution Problem is to compute the min-plus convolution $z$ of $x$ and $y$. This problem has appeared in the literature under various names such as "minimum convolution," "epigraphical sum," "inf-convolution," and "lowest midpoint" P HI [H [231 [2ll [251 [26]. Although it is easy to obtain an $O(n^2)$-time algorithm, no subquadratic algorithm was known until recently, when Bremner et al. [7] proposed an $O(n^2 / \log n)$-time algorithm. In the following, we shall give an $O(n^{1/2} T(n^{1/2}))$-time algorithm for the Min-Plus Convolution Problem, where $T(n)$ is the time required to solve the all-pairs shortest paths problem on a graph of $n$ nodes. To date, the best algorithm for computing all-pairs shortest paths on a graph of $n$ nodes runs in $O(n^3 (\log\log n)^3 / (\log n)^2)$ time [8]. Thus, our work implies an $O(n^2 (\log\log n)^3 / (\log n)^2)$-time algorithm for the min-plus convolution problem, which is slightly superior to the first subquadratic $O(n^2 / \log n)$-time algorithm recently proposed by Bremner et al. [7].
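
The claimed bound follows by substituting the all-pairs shortest paths bound into $n^{1/2} T(n^{1/2})$. The short derivation below spells out that arithmetic, using $\log n^{1/2} = \frac{1}{2} \log n$ and $\log\log n^{1/2} \leq \log\log n$; it is a worked check rather than part of the original argument.

```latex
\begin{align*}
n^{1/2}\, T\bigl(n^{1/2}\bigr)
  &= n^{1/2} \cdot O\!\left( \frac{(n^{1/2})^{3}\, (\log\log n^{1/2})^{3}}{(\log n^{1/2})^{2}} \right) \\
  &= O\!\left( \frac{n^{2}\, (\log\log n)^{3}}{\tfrac{1}{4} (\log n)^{2}} \right)
   = O\!\left( \frac{n^{2}\, (\log\log n)^{3}}{(\log n)^{2}} \right).
\end{align*}
```

Dividing this bound by $n^2 / \log n$ gives $(\log\log n)^3 / \log n$, which tends to 0 as $n$ grows, so the new bound is asymptotically smaller.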

Definition 6: The min-plus product $BC$ of a $d \times n'$ matrix $B = [b_{i,j}]$ and an $n' \times d$ matrix $C = [c_{i,j}]$ is a $d \times d$ matrix $D = [d_{i,j}]$ where $d_{i,j} = \min_{k=0}^{n'-1} \{b_{i,k} + c_{k,j}\}$.

Note that the notion of "min-plus product" is different from the notion of "min-plus convolution". It is well known [1] that the time complexity of computing the min-plus product of two $n' \times n'$ matrices is asymptotically equal to that of computing all-pairs shortest paths for a graph with $n'$ vertices. The next lemma and its proof are due to Takaoka [27]; we include the proof here for completeness.

Lemma 8: [27] Given a $T(n')$-time algorithm for computing the min-plus product of any two $n' \times n'$ matrices, the computation of the min-plus product of $B$ and $C$, where $B$ is a $d \times n'$ matrix and $C$ is an $n' \times d$ matrix, can be done in $O(\frac{n'}{d} T(d))$ time if $d \leq n'$.

Proof: For simplicity we assume that $d$ divides $n'$. We first split $B$ into $n'/d$ matrices $B_1, \ldots, B_{n'/d}$ of dimension $d \times d$ and $C$ into $n'/d$ matrices $C_1, \ldots, C_{n'/d}$ of dimension $d \times d$. Then we can compute $\{B_1 C_1, B_2 C_2, \ldots, B_{n'/d} C_{n'/d}\}$ in $O(\frac{n'}{d} T(d))$ time by the given algorithm. The $(i, j)$-th entry of the min-plus product of $B$ and $C$ is $\min_{k=1}^{n'/d} \{$the $(i, j)$-th entry of $B_k C_k\}$. □
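
The block decomposition in the proof of Lemma 8 can be sketched as follows. The cubic-time `min_plus_square` below merely stands in for the assumed $T(d)$-time square product, and both function names are hypothetical; the point is only the splitting into $d \times d$ blocks and the entrywise minimum.

```python
INF = float("inf")


def min_plus_square(B, C):
    """Naive min-plus product of two d x d matrices (placeholder for a
    faster T(d)-time routine)."""
    d = len(B)
    return [[min(B[i][k] + C[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def min_plus_rectangular(B, C):
    """Min-plus product of a d x n' matrix B and an n' x d matrix C,
    assuming d divides n', via n'/d square block products."""
    d, n_prime = len(B), len(C)
    D = [[INF] * d for _ in range(d)]
    for start in range(0, n_prime, d):
        Bk = [row[start:start + d] for row in B]   # d x d column block of B
        Ck = C[start:start + d]                    # d x d row block of C
        Pk = min_plus_square(Bk, Ck)
        for i in range(d):
            for j in range(d):
                D[i][j] = min(D[i][j], Pk[i][j])   # entrywise minimum
    return D
```

Replacing `min_plus_square` with any faster square routine changes only the cost of each of the $n'/d$ block products, which is where the $O(\frac{n'}{d} T(d))$ bound comes from.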

Our new algorithm for the Min-Plus Convolution Problem is as follows.

Algorithm MinPlusConvolution

Input: $x = (x_0, x_1, \ldots, x_{n-1})$ and $y = (y_0, y_1, \ldots, y_{n-1})$.
Output: $z = (z_0, z_1, \ldots, z_{n-1})$ such that $z_k = \min_{i=0}^{k} \{x_i + y_{k-i}\}$ for $k = 0, 1, \ldots, n - 1$.

1: Construct a $\lceil n^{1/2} \rceil \times (2n - 1)$ matrix $B = [b_{i,j}]$ such that the $i$-th row of $B$ is equal to $(\underbrace{\infty, \ldots, \infty}_{n - 1 - i \lceil n^{1/2} \rceil}, x_0, x_1, \ldots, x_{n-1}, \underbrace{\infty, \ldots, \infty}_{i \lceil n^{1/2} \rceil})$ for $i = 0, 1, \ldots, \lceil n^{1/2} \rceil - 1$.
2: Construct a $(2n - 1) \times \lceil n^{1/2} \rceil$ matrix $C = [c_{i,j}]$ such that the transpose of the $j$-th column of $C$ is equal to $(\underbrace{\infty, \ldots, \infty}_{j}, y_{n-1}, y_{n-2}, \ldots, y_0, \underbrace{\infty, \ldots, \infty}_{n - 1 - j})$ for $j = 0, 1, \ldots, \lceil n^{1/2} \rceil - 1$.
3: Let $D = [d_{i,j}]$ be the min-plus product of $B$ and $C$.
4: For $k = 0, 1, \ldots, n - 1$ do
       Find $i, j$ such that $k = i \times \lceil n^{1/2} \rceil + j$, where $0 \leq j < \lceil n^{1/2} \rceil$.
       Set $z_k$ to $d_{i,j}$.
5: Output $z = (z_0, z_1, \ldots, z_{n-1})$.
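The following lemma ensures the correctness.

Lemma 9: In MinPlusConvolution, $d_{i,j} = z_{i \lceil n^{1/2} \rceil + j}$ if $i \lceil n^{1/2} \rceil + j \leq n - 1$.

Proof:

$$\begin{aligned}
d_{i,j} &= \min_{t=0}^{2n-2} \{b_{i,t} + c_{t,j}\} \\
&= \min\Bigl\{ \min_{t=0}^{n-2-i\lceil n^{1/2} \rceil} \{b_{i,t} + c_{t,j}\},\ \min_{t=n-1-i\lceil n^{1/2} \rceil}^{n+j-1} \{b_{i,t} + c_{t,j}\},\ \min_{t=n+j}^{2n-2} \{b_{i,t} + c_{t,j}\} \Bigr\} \\
&= \min\Bigl\{ \min_{t=0}^{n-2-i\lceil n^{1/2} \rceil} \{\infty + c_{t,j}\},\ \min_{t=n-1-i\lceil n^{1/2} \rceil}^{n+j-1} \{b_{i,t} + c_{t,j}\},\ \min_{t=n+j}^{2n-2} \{b_{i,t} + \infty\} \Bigr\} \\
&= \min_{t=n-1-i\lceil n^{1/2} \rceil}^{n+j-1} \{b_{i,t} + c_{t,j}\} \\
&= \min\{x_0 + y_{i\lceil n^{1/2} \rceil + j},\ x_1 + y_{i\lceil n^{1/2} \rceil + j - 1},\ \ldots,\ x_{i\lceil n^{1/2} \rceil + j} + y_0\} \\
&= \min_{s=0}^{i\lceil n^{1/2} \rceil + j} \{x_s + y_{i\lceil n^{1/2} \rceil + j - s}\} = z_{i\lceil n^{1/2} \rceil + j}.
\end{aligned}$$

□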
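
The construction of $B$ and $C$ can be exercised on a small input as follows. This sketch uses a naive rectangular min-plus product in place of the fast routine of Lemma 8, so it demonstrates only the layout of the two matrices and the index mapping $k = i \lceil n^{1/2} \rceil + j$, not the running time. The function names are hypothetical, and rows of $B$ with $i \lceil n^{1/2} \rceil > n - 1$ are left entirely infinite because they are never read (an assumption of this sketch).

```python
import math

INF = float("inf")


def min_plus_convolution_naive(x, y):
    """Direct O(n^2) evaluation of z_k = min_{i=0..k} x_i + y_{k-i}."""
    return [min(x[i] + y[k - i] for i in range(k + 1)) for k in range(len(x))]


def min_plus_convolution_matrix(x, y):
    """Min-plus convolution via one min-plus product of an m x (2n-1)
    matrix B and a (2n-1) x m matrix C, where m = ceil(sqrt(n))."""
    n = len(x)
    m = math.isqrt(n - 1) + 1          # ceil(sqrt(n)) for n >= 1
    width = 2 * n - 1
    B = [[INF] * width for _ in range(m)]
    for i in range(m):
        lead = n - 1 - i * m           # leading infinities in row i
        if lead < 0:
            continue                   # row only matters for k >= n
        for a in range(n):
            B[i][lead + a] = x[a]
    C = [[INF] * m for _ in range(width)]
    for j in range(m):
        for b in range(n):
            C[j + n - 1 - b][j] = y[b]  # column j holds y reversed, shifted by j
    D = [[min(B[i][t] + C[t][j] for t in range(width)) for j in range(m)]
         for i in range(m)]
    return [D[k // m][k % m] for k in range(n)]
```

In row $i$ and column $j$, position $t$ pairs $x_{t - (n-1-im)}$ with $y_{n+j-1-t}$, whose indices always sum to $im + j$; this is exactly the cancellation used in the proof of Lemma 9.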



We now analyze the time complexity. Let $T(n)$ denote the time required to compute the min-plus product of two $n \times n$ matrices. Steps 1 and 2 take $O(n^{3/2})$ time, and by Lemma 8, Step 3 takes $O(\frac{2n - 1}{\lceil n^{1/2} \rceil} T(\lceil n^{1/2} \rceil)) = O(n^{1/2} T(n^{1/2}))$ time. Steps 4 and 5 take $O(n)$ time. Therefore, the total running time is $O(n^{1/2} T(n^{1/2}) + n^{3/2})$. Since $T(n) = \Omega(n^2)$, we have $O(n^{1/2} T(n^{1/2}) + n^{3/2}) = O(n^{1/2} T(n^{1/2}))$. Theorem 4 summarizes our results for the Min-Plus Convolution Problem.

Theorem 4: The running time of Algorithm MinPlusConvolution is $O(n^{1/2} T(n^{1/2}))$, where $T(n)$ is the time required to compute the min-plus product of two $n \times n$ matrices.

The next lemma was proved by Bergkvist and Damaschke in [4].

Lemma 10: [4] Given a sequence $H = (h_1, h_2, \ldots, h_n)$, the Maximum Consecutive Sums Problem is to compute a sequence $(w_1, w_2, \ldots, w_n)$ where $w_i = \max\{\sum_{j=p}^{q} h_j : \mathrm{length}(p, q) = i\}$ for each $i = 1, 2, \ldots, n$. The Maximum Consecutive Sums Problem can be reduced to the Min-Plus Convolution Problem in linear time.

Corollary 2: Given a sequence $H = (h_1, h_2, \ldots, h_n)$, we can compute in $O(n^{1/2} T(n^{1/2}))$ time a sequence $(w_1, w_2, \ldots, w_n)$ such that $w_i = \max\{\sum_{j=p}^{q} h_j : \mathrm{length}(p, q) = i\}$ for each $i = 1, 2, \ldots, n$ by making use of Algorithm MinPlusConvolution, where $T(n)$ is the time required to compute the min-plus product of two $n \times n$ matrices.

Proof: Immediately from Theorem 4 and Lemma 10. □
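
One way to realize such a reduction is via prefix sums, as sketched below. This is an assumed construction for illustration and is not necessarily the reduction used in [4]: with $P$ the prefix-sum array, set $x_a = P[a]$ and $y_b = -P[n - b]$, so that $z_{n-i} = \min_a (P[a] - P[a + i]) = -w_i$. The naive convolution stands in for Algorithm MinPlusConvolution, and the function names are hypothetical.

```python
from itertools import accumulate


def min_plus_convolution(x, y):
    """Naive min-plus convolution; any faster routine could be swapped in."""
    return [min(x[i] + y[k - i] for i in range(k + 1)) for k in range(len(x))]


def max_consecutive_sums(h):
    """Return [w_1, ..., w_n], where w_i is the largest sum of i
    consecutive entries of h."""
    n = len(h)
    P = [0] + list(accumulate(h))            # P[0..n]
    x = P                                    # x[a] = P[a]
    y = [-P[n - b] for b in range(n + 1)]    # y[b] = -P[n - b]
    z = min_plus_convolution(x, y)           # z[k] = min over a + b = k
    # a + b = n - i pairs left prefix a with right prefix a + i.
    return [-z[n - i] for i in range(1, n + 1)]
```

Building `x` and `y` takes linear time, so the cost is dominated by the single convolution of two vectors of length $n + 1$.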
3.3 Algorithm

We next show how to solve the PSEI problem in $O(n \cdot T(L^{1/2}) / L^{1/2})$ time, where $T(n')$ is the time required to compute the min-plus product of two $n' \times n'$ matrices. To avoid notational overload, we assume that $4L$ divides $n$. By Corollary 1, we can find an index interval $[i, j]$ maximizing hit(i, j) subject to length(i, j) $\geq L$ in linear time. Then there are three cases to consider: (1) hit(i, j) $= 0$; (2) hit(i, j) $> 0$; (3) hit(i, j) $< 0$. If it is Case 1, then $[i, j]$ must be an optimal solution, and we are done. If it is Case 2, then we know there is at least one index interval satisfying the length constraint with positive hit-support. Define $f(\ell, h)$ by

$$f(\ell, h) = \begin{cases} \frac{h}{\sqrt{\ell}} & \text{if } h > 0; \\ 0 & \text{otherwise.} \end{cases}$$

By Lemma 6, $f$ is quasiconvex, so we can call MSI($D$, $L$, $f$) to find the index interval $[i', j']$ maximizing $f$(length(i', j'), hit(i', j')) subject to length(i', j') $\geq L$ in linear time. Clearly, $[i', j']$ is an optimal solution, and we are done. If it is Case 3, i.e., all index intervals satisfying the length constraint have negative hit-supports, we do the following. First, by letting $I_k$ be the index interval $[2kL + 1, 2kL + 4L]$ for each $k = 0, 1, \ldots, \frac{n}{2L} - 2$, we can cover the whole index interval $[1, n]$ with $\frac{n}{2L} - 1$ subintervals, each of length $4L$. By making use of Corollary 2, we are able to compute the index interval $[i_k, j_k] \subseteq I_k$ maximizing the eccentricity subject to the length constraint in $O(L^{1/2} T(L^{1/2}))$ time for each $k = 0, 1, \ldots, \frac{n}{2L} - 2$. According to Lemma 7, some $[i_k, j_k]$ must be the optimal solution. The detailed algorithm is given below.

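Algorithm ComputePSEI

Input: A plain sequence $D$ of $n$ number pairs and a length lower bound $L$ with $1 \leq L \leq n$.
Output: An index interval $I = [i, j]$ maximizing ecc(i, j) subject to length(i, j) $\geq L$.

1: $(i, j) \leftarrow \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{hit}(p, q)$.
2: If hit(i, j) $= 0$, then return $[i, j]$.
3: If hit(i, j) $> 0$ then
       1: define $f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}$ by letting $f(\ell, h) = \frac{h}{\sqrt{\ell}}$ if $h > 0$ and 0, otherwise;
       2: return MSI($D$, $L$, $f$).
4: For $k$ from 0 to $\frac{n}{2L} - 2$,
       1. compute $(w_1, \ldots, w_{4L})$, where $w_j = \max\{\mathrm{hit}(p, q) \mid \mathrm{length}(p, q) = j$ and $2kL + 1 \leq p \leq q \leq 2kL + 4L\}$ for each $j = 1, \ldots, 4L$;
       2. compute $ecc_j = \frac{w_j}{\sqrt{j}}$ for each $j = L, L + 1, \ldots, 2L - 1$;
       3. $j_k \leftarrow \arg\max_{j=L}^{2L-1} ecc_j$;
       4. $i_k \leftarrow \arg\max_{i=2kL+1}^{2kL+4L-j_k+1} \mathrm{ecc}(i, i + j_k - 1)$.
5: Return the max eccentricity interval in $\{[i_k, i_k + j_k - 1] : k = 0, 1, \ldots, \frac{n}{2L} - 2\}$, where $j_k$ is the length selected in Step 4.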
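
The windowing used for Case 3 can be sketched as follows. To stay short, this illustration scans every candidate interval inside each window directly instead of calling the maximum-consecutive-sums routine of Corollary 2, so it shows the window layout and the length restriction $L \leq \mathrm{length} < 2L$ but not the stated running time. The function name `psei_case3` is hypothetical, and the sketch assumes $n \geq 4L$ and that $2L$ divides $n$.

```python
import math
from itertools import accumulate


def psei_case3(h, L):
    """Case 3 sketch: every interval of length >= L has negative hit-support.

    Returns (ecc, i, j) with 1-based inclusive endpoints.
    """
    n = len(h)
    P = [0] + list(accumulate(h))                  # prefix sums, P[0] = 0
    best = None
    for k in range(n // (2 * L) - 1):              # k = 0, ..., n/(2L) - 2
        lo, hi = 2 * k * L + 1, 2 * k * L + 4 * L  # window I_k = [lo, hi]
        for length in range(L, 2 * L):             # Lemma 7: optimum is < 2L
            for p in range(lo, hi - length + 2):
                q = p + length - 1
                ecc = (P[q] - P[p - 1]) / math.sqrt(length)
                if best is None or ecc > best[0]:
                    best = (ecc, p, q)
    return best
```

Consecutive windows overlap by $2L$ positions, so every interval shorter than $2L$ lies entirely inside at least one window.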
Theorem 5: Algorithm ComputePSEI solves the PSEI problem in $O(n \cdot T(L^{1/2}) / L^{1/2})$ time, where $T(n')$ is the time required to compute the min-plus product of two $n' \times n'$ matrices.

Proof: We begin by considering the correctness. Let $(i, j) = \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{hit}(p, q)$ and $(i^*, j^*) = \arg\max_{\mathrm{length}(p, q) \geq L} \mathrm{ecc}(p, q)$.

In the case where hit(i, j) $= 0$, we have ecc(p, q) $= \frac{\mathrm{hit}(p, q)}{\sqrt{\mathrm{length}(p, q)}} \leq 0$ for all $(p, q)$ with length(p, q) $\geq L$. It follows that $0 \geq$ ecc($i^*$, $j^*$) $\geq$ ecc(i, j) $= 0$, so index interval $[i, j]$ is an optimal solution.

In the case where hit(i, j) $> 0$, we have ecc($i^*$, $j^*$) $\geq$ ecc(i, j) $> 0$. Let $f(\ell, h) = \frac{h}{\sqrt{\ell}}$ if $h > 0$ and 0, otherwise. By Lemma 6, $f$ is quasiconvex, so the call to MSI($D$, $L$, $f$) will return an index interval $[i', j']$ maximizing $f(i', j')$ subject to length(i', j') $\geq L$, where we write $f(p, q)$ as shorthand for $f$(length(p, q), hit(p, q)). We next prove that ecc(i', j') $\geq$ ecc($i^*$, $j^*$), so index interval $[i', j']$ is an optimal solution. Note that for any index interval $[p, q]$, we have $f(p, q) =$ ecc(p, q) as long as ecc(p, q) $> 0$ or $f(p, q) > 0$. Since ecc($i^*$, $j^*$) $> 0$, we have ecc($i^*$, $j^*$) $= f(i^*, j^*) > 0$. It follows that $f(i', j') \geq f(i^*, j^*) > 0$, which implies that ecc(i', j') $= f(i', j')$. Therefore, ecc(i', j') $= f(i', j') \geq f(i^*, j^*) =$ ecc($i^*$, $j^*$).

In the case where hit(i, j) $< 0$, we have length($i^*$, $j^*$) $< 2L$ by Lemma 7. It follows that $[i^*, j^*]$ must be contained in $[2kL + 1, 2kL + 4L]$ for some $k \in \{0, 1, \ldots, \frac{n}{2L} - 2\}$ and thus the index interval returned at Step 5 must be an optimal solution.

We next analyze the running time. By Corollary 1, Step 1 takes $O(n)$ time. Step 2 takes constant time. By Theorem 3, Step 3 takes $O(n)$ time. By Corollary 2, each iteration of the loop at Step 4 takes $O(L^{1/2} T(L^{1/2}) + L)$ time. Thus Step 4 takes $O(\frac{n}{L} L^{1/2} T(L^{1/2}) + n) = O(n \cdot T(L^{1/2}) / L^{1/2})$ time. Step 5 takes $O(n / L)$ time. Therefore the total running time is $O(n \cdot T(L^{1/2}) / L^{1/2})$. □


4 Concluding remarks 

To the best of our knowledge, no non-trivial lower bound for the PSEI problem has been proved so far. Thus, there is still a large gap between the trivial lower bound of $\Omega(n)$ and the upper bound of $O(n L_s (\log\log L_s)^3 / (\log L_s)^2)$ for the PSEI problem. Bridging this gap remains an open problem.